Add batched simplices_containing query and Rust default_loss for LearnerND hot loops by basnijholt · Pull Request #13 · python-adaptive/adaptive-triangulation

basnijholt · 2026-06-10T20:16:52Z

Summary

Profiling LearnerND with the Rust backend active (shipped in adaptive 1.5.0, adaptive#493) shows that only ~14% (2D) / ~25% (3D) of the runtime is now spent inside this extension, while roughly half is LearnerND's own Python. Two of the remaining hotspots are triangulation-shaped rather than learner-shaped, so they belong here:

| component (3D `sphere_of_fire`, 1500 pts) | share of runtime |
| --- | --- |
| adaptive's own Python (`learnerND.py`) | 48% |
| Rust extension calls | 25% |
| sortedcontainers (simplex queue) | 10% |
| numpy/scipy + other | 17% |

- `tell_pending`'s neighbour loop: for every pending point, adaptive loops over all simplices sharing a vertex with the containing simplex and calls `tri.point_in_simplex` one at a time (~14 checks per point in 2D, ~70 in 3D, ~96% misses), which costs a barycentric solve plus an FFI round-trip per check.
- `default_loss`: a Python wrapper that tuple-splices vertices and values before calling `simplex_volume_in_embedding`; the wrapper costs 3-7x the Rust math it wraps.
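To make the first hotspot concrete, here is a minimal pure-Python sketch of the per-candidate pattern described above, restricted to 2D. It is not adaptive's code: `barycentric_2d`, `point_in_simplex`, `neighbour_loop`, and the five-vertex toy triangulation are illustrative names and data chosen for this example, and the containment test is a simplified stand-in for `tri.point_in_simplex`.

```python
def barycentric_2d(p, a, b, c):
    """Barycentric coordinates of p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ZeroDivisionError("degenerate triangle")
    l0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (l0, l1, 1.0 - l0 - l1)


def point_in_simplex(p, triangle, eps=1e-8):
    """True if p lies inside or on the boundary of the triangle (within eps)."""
    return all(l >= -eps for l in barycentric_2d(p, *triangle))


def neighbour_loop(point, containing, vertices, simplices, eps=1e-8):
    """Check every simplex sharing a vertex with `containing`, one solve each."""
    shared = set(containing)
    candidates = [s for s in simplices if shared & set(s)]
    hits = [
        s for s in candidates
        if point_in_simplex(point, [vertices[v] for v in s], eps)
    ]
    return sorted(hits), len(candidates)


# Toy triangulation (hypothetical data): unit square split around its centre.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
SIMPLICES = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (0, 3, 4)]

# A point on the shared edge between vertices 1 and 4.
hits, checks = neighbour_loop((0.75, 0.25), (0, 1, 4), VERTICES, SIMPLICES)
```

Even in this tiny case four solves are needed to find two hits; in the real learner each of those checks also crosses the Python/Rust boundary.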

New APIs

`Triangulation.simplices_containing(point, simplex=None, candidates=None, eps=1e-8)`

Returns all simplices containing `point`, as a sorted list of tuples. Instead of one solve per neighbour, it solves once for a simplex known to contain the point (the `simplex` hint when given and valid, which maps directly onto `tell_pending`'s `simplex` argument, else `locate_point`), reduces it to the face the point actually lies on, and returns the simplices containing that face via the existing facet index. A stale or wrong hint safely falls back to locating the point. Passing `candidates` instead filters those through `point_in_simplex`, preserving the original per-candidate semantics. Errors match `point_in_simplex` (`ZeroDivisionError` / `numpy.linalg.LinAlgError`).
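The single-solve idea can be sketched in pure Python for the 2D case. This is an illustration under stated assumptions, not the Rust implementation: `simplices_containing_sketch` is a hypothetical name, the hint is assumed to be valid (the sketch raises instead of falling back to `locate_point`), and a linear scan over the simplices stands in for the facet index.

```python
def barycentric_2d(p, a, b, c):
    """Barycentric coordinates of p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ZeroDivisionError("degenerate triangle")
    l0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return (l0, l1, 1.0 - l0 - l1)


def simplices_containing_sketch(point, hint, vertices, simplices, eps=1e-8):
    """All simplices containing `point`, using one solve on the hint simplex."""
    coords = barycentric_2d(point, *(vertices[v] for v in hint))
    if any(l < -eps for l in coords):
        # Assumption: the sketch rejects a bad hint; it does not re-locate.
        raise ValueError("hint simplex does not contain the point")
    # The face the point lies on: vertices with a non-negligible weight.
    face = {v for v, l in zip(hint, coords) if l > eps}
    # Linear scan standing in for a facet-index lookup.
    return sorted(s for s in simplices if face <= set(s))


# Toy triangulation (hypothetical data): unit square split around its centre.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
SIMPLICES = [(0, 1, 4), (1, 2, 4), (2, 3, 4), (0, 3, 4)]

interior = simplices_containing_sketch((0.5, 0.25), (0, 1, 4), VERTICES, SIMPLICES)
on_edge = simplices_containing_sketch((0.75, 0.25), (0, 1, 4), VERTICES, SIMPLICES)
on_vertex = simplices_containing_sketch((0.5, 0.5), (0, 1, 4), VERTICES, SIMPLICES)
```

An interior point reduces to the full triangle (one result), a point on an edge reduces to that edge (the two triangles sharing it), and a point on a vertex reduces to the vertex (its whole star). This works because, in a valid triangulation, a simplex contains the point exactly when it contains the face whose relative interior the point lies in.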

`default_loss(simplex, values, value_scale=None)`

The LearnerND default loss (embedded simplex volume) taking the scaled vertex/value arrays directly, with bulk-copy fast paths for 1-D/2-D f64 numpy arrays (exactly what _compute_loss passes). Signature-compatible with adaptive's loss_per_simplex functions, so the adaptive side can swap it in with one import; value_scale is accepted and unused, like the reference.
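For readers unfamiliar with the quantity being computed, the following is a rough pure-Python sketch of an embedded-simplex-volume loss. It splices each vertex with its value and computes the volume from a Gram determinant, which is a technique chosen here for brevity and is not claimed to be what the Rust code or adaptive's `simplex_volume_in_embedding` uses internally. The names `_det`, `embedded_volume`, and `default_loss_sketch` are hypothetical, and the sketch returns 0.0 for degenerate simplices, which may differ from the real functions.

```python
from math import factorial, sqrt


def _det(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    det = 1.0
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        if m[pivot][i] == 0.0:
            return 0.0
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            det = -det
        det *= m[i][i]
        for r in range(i + 1, n):
            factor = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= factor * m[i][c]
    return det


def embedded_volume(points):
    """Volume of an n-simplex given n+1 points in a space of dimension >= n."""
    origin, *rest = points
    edges = [[x - o for x, o in zip(p, origin)] for p in rest]
    gram = [[sum(a * b for a, b in zip(u, v)) for v in edges] for u in edges]
    # Clamp tiny negative round-off before taking the square root.
    return sqrt(max(_det(gram), 0.0)) / factorial(len(edges))


def default_loss_sketch(simplex, values, value_scale=None):
    """Volume of the simplex lifted into (input, output) space.

    `value_scale` is accepted and unused, mirroring the signature above.
    """
    points = []
    for x, y in zip(simplex, values):
        y = tuple(y) if hasattr(y, "__iter__") else (y,)
        points.append((*x, *y))
    return embedded_volume(points)


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
flat = default_loss_sketch(triangle, [0.0, 0.0, 0.0])      # plain area
tilted = default_loss_sketch(triangle, [0.0, 0.0, 1.0])    # scalar values
vector = default_loss_sketch(triangle, [(0.0, 0.0), (0.0, 0.0), (1.0, 1.0)])
```

A flat function gives the triangle's ordinary area, while any variation in the values tilts the lifted simplex and increases the loss, which is what steers sampling toward regions where the function changes.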

Performance

End-to-end LearnerND on adaptive 1.5.0 with both wired in the way the adaptive-side integration would (examples/learnernd_batched_apis.py):

2D ring_of_fire, 3000 points:
  baseline (Rust backend, adaptive >= 1.5)     0.46s  (1.00x)
  + rust default_loss                          0.41s  (1.12x)
  + simplices_containing tell_pending          0.44s  (1.04x)
  + both                                       0.40s  (1.17x, identical points)

3D sphere_of_fire, 1500 points:
  baseline (Rust backend, adaptive >= 1.5)     0.74s  (1.00x)
  + rust default_loss                          0.65s  (1.15x)
  + simplices_containing tell_pending          0.62s  (1.19x)
  + both                                       0.53s  (1.40x, identical points)

Every configuration samples identical points to the baseline. The win grows with dimension (vertex stars get bigger), which is exactly where LearnerND is used.

Deliberately not moved here: the simplex priority queue (generic data structure, fixable in adaptive with a lazy-deletion heap) and the pending-point/subtriangulation bookkeeping (that would absorb LearnerND's state machine and break user-pluggable loss functions).
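As a rough illustration of the lazy-deletion heap mentioned above, here is a minimal sketch built on the standard-library `heapq`. It is not adaptive's queue and not a proposed patch: `LazyDeletionQueue` and its methods are hypothetical names, and simplices are represented as plain tuples.

```python
import heapq
import itertools


class LazyDeletionQueue:
    """Max-priority queue where removals are deferred until pop time."""

    def __init__(self):
        self._heap = []
        self._live = {}  # item -> id of its currently valid heap entry
        self._counter = itertools.count()

    def push(self, item, priority):
        """Insert or re-prioritise `item`; any older entry becomes stale."""
        entry_id = next(self._counter)
        self._live[item] = entry_id
        # Negate the priority because heapq is a min-heap.
        heapq.heappush(self._heap, (-priority, entry_id, item))

    def discard(self, item):
        """Mark `item` as removed without touching the heap."""
        self._live.pop(item, None)

    def pop(self):
        """Return the live (item, priority) pair with the highest priority."""
        while self._heap:
            neg_priority, entry_id, item = heapq.heappop(self._heap)
            if self._live.get(item) == entry_id:
                del self._live[item]
                return item, -neg_priority
            # Otherwise the entry is stale: skip it.
        raise IndexError("pop from an empty queue")

    def __len__(self):
        return len(self._live)


queue = LazyDeletionQueue()
queue.push((0, 1, 2), 0.5)
queue.push((1, 2, 3), 0.9)
queue.push((2, 3, 4), 0.7)
queue.discard((1, 2, 3))      # simplex was split; its entry stays in the heap
first = queue.pop()           # skips the stale (1, 2, 3) entry
queue.push((0, 1, 2), 0.1)    # re-prioritise an existing simplex
second = queue.pop()
```

The trade-off is that stale entries occupy memory until they surface, in exchange for pushes and removals that never need to search or re-sort the container.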

Also includes a README benchmark refresh against adaptive 1.5.0: the usage section now leads with the automatic backend selection (pip install "adaptive[rust]"), and the end-to-end table was re-measured — LearnerND's pure-Python path got faster in 1.5.0, so the honest headline ratio is now 3.3× (was 3.7×).

Testing

- 16 new tests in `tests/test_batched_queries.py`: brute-force parity in 2D/3D/4D (including vertex probes), equivalence with the exact `tell_pending` neighbour loop it replaces, hint/stale-hint/empty-hint behaviour, `candidates` filtering, `eps` forwarding, and `default_loss` parity against `adaptive.learner.learnerND.default_loss` for scalar/vector values, numpy arrays, and plain lists.
- Full suite: 131 passed against the locked adaptive (CI uses `uv sync --locked`, adaptive 1.3.2). Against adaptive 1.5.0, everything passes except the pre-existing `test_learnernd_with_neighbor_aware_loss_runs` failure: 1.5.0 auto-selects the Rust backend, which surfaces the known collinear `simplex_volume_in_embedding` divergence (raises where the reference returns 0.0). That failure is unrelated to this change and needs its own fix before bumping the lock.
- `cargo clippy -D warnings`, `cargo fmt`, and `ruff` (0.11.0) are all clean.

Follow-up: a small adaptive-side PR can adopt both — tell_pending collapses to one simplices_containing call, and triangulation_backend re-exports default_loss.

Profiling LearnerND with the Rust backend (adaptive PR #493) showed only 14-25% of runtime left inside this extension; two of the remaining Python hotspots are triangulation-shaped and move here:

- Triangulation.simplices_containing(point, simplex=None, candidates=None, eps=1e-8): all simplices containing a point in one call. Instead of one barycentric solve per neighbouring simplex (the loop LearnerND.tell_pending runs in Python today, ~14 checks per point in 2D, ~70 in 3D), it solves once for a simplex known to contain the point (the hint, or locate_point), reduces it to the face the point lies on, and looks up the simplices containing that face in the facet index.
- default_loss(simplex, values, value_scale=None): the LearnerND default loss (embedded simplex volume) taking the scaled vertex/value arrays directly, signature-compatible with loss_per_simplex.

End-to-end with both wired into LearnerND (examples/learnernd_batched_apis.py): 1.18x in 2D and 1.39x in 3D on top of the Rust backend, with identical sampled points.

adaptive 1.5.0 ships the automatic Rust-backend selection, so the usage section now leads with `pip install "adaptive[rust]"` (monkey-patching is only needed for < 1.5.0) and the example points at the release instead of the PR branch. Re-measured all tables against 1.5.0: standalone numbers are unchanged within noise; LearnerND's pure-Python path got faster in 1.5.0, so the honest end-to-end ratio is now 3.3x (was 3.7x). Adds the batched-API numbers (1.17x 2D / 1.40x 3D on top of the backend) to the performance section.

basnijholt added 2 commits June 10, 2026 13:16

basnijholt merged commit d65ac17 into main Jun 10, 2026
17 checks passed

basnijholt mentioned this pull request Jun 10, 2026

chore: bump version to 0.3.0 #14

Merged

basnijholt merged 2 commits into main from batched-queries-and-default-loss
